ImpactMojo
Premium

Common Mistakes & Diagnostics Guide

Avoiding Pitfalls in Econometric Analysis

🎯 How to Use This Guide

This guide covers the most common mistakes in econometric analysis and how to avoid them. Each mistake includes a description of the problem, examples of wrong and right practice, and concrete steps for avoiding it.

Severity Levels: sections are color-coded by how badly the mistake can damage an analysis, from 🔴 errors that invalidate causal conclusions, through 🟡 errors in specification and method choice, to 🔵 problems with diagnostics, presentation, and reporting.

🔴 Interpretation and Causality Errors

🚫 Claiming Causation from Correlation

The Problem

Interpreting regression coefficients as causal effects when identification assumptions aren't met. This is the most fundamental error in applied econometrics.

❌ Wrong

"The regression shows that education causes a 7.8% increase in income."

Problem: Ignores ability bias, family background, and selection.

✅ Right

"Education is associated with 7.8% higher income. If we could eliminate ability bias and other confounders, this might represent a causal effect."

How to Avoid
  • Always state identification assumptions explicitly
  • Use causal language only when assumptions are credible
  • Acknowledge potential confounders and their likely direction of bias
  • Consider alternative explanations for your results

🎯 Misinterpreting Log-Linear Coefficients

The Problem

Incorrectly interpreting coefficients in log-linear regressions, especially for large effects or binary variables.

Coefficient: β = 0.08
  ❌ Wrong: "an 8% increase"
  ✅ Correct: an increase of 0.08 in log(Y); exactly (e^0.08 − 1) × 100 ≈ 8.3% increase in Y
Coefficient: β = 0.5 (binary X)
  ❌ Wrong: "a 50% increase"
  ✅ Correct: (e^0.5 − 1) × 100 ≈ 64.9% increase
Coefficient: β = −0.2
  ❌ Wrong: "a 20% decrease"
  ✅ Correct: (e^−0.2 − 1) × 100 ≈ −18.1%, i.e., an 18.1% decrease
* Stata: Converting log coefficients correctly
display "Percentage change: " (exp(0.08) - 1)*100    // 8.33%
display "For binary variable: " (exp(0.5) - 1)*100   // 64.9%

# R: Converting log coefficients
(exp(0.08) - 1) * 100  # 8.33%
(exp(0.5) - 1) * 100   # 64.9%

📊 Confusing Statistical vs Economic Significance

The Problem

Focusing only on p-values without considering economic magnitude or practical importance.

🚩 Red Flags
  • Reporting only that results are "significant at 1%"
  • With large samples, tiny effects become "significant"
  • Ignoring confidence intervals and effect sizes
✅ Best Practice
  • Always report effect sizes in meaningful units
  • Compare effect sizes to relevant benchmarks
  • Consider cost-effectiveness and policy relevance
  • Report confidence intervals, not just point estimates
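The large-sample trap above can be made concrete with a back-of-the-envelope calculation (a minimal Python sketch; the 0.2%-of-a-standard-deviation effect size is an invented illustration):

```python
import math

# Hypothetical: a trivially small effect (0.2% of a standard deviation)
# crosses the 5% significance threshold purely because n is huge.
effect = 0.002  # assumed standardized difference in means
sd = 1.0

for n in (1_000, 1_000_000, 100_000_000):
    se = sd * math.sqrt(2 / n)  # SE of a difference in means, equal group sizes
    t = effect / se
    ci = (effect - 1.96 * se, effect + 1.96 * se)
    print(f"n = {n:>11,}: t = {t:6.2f}, 95% CI = ({ci[0]:+.5f}, {ci[1]:+.5f})")
```

At n = 100 million the effect is comfortably "significant" yet still economically negligible, which is exactly why effect sizes and confidence intervals belong in the write-up.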

🟡 Specification and Model Selection

⚙️ Including Bad Controls

The Problem

Controlling for variables that are outcomes of the treatment (bad controls) or that create collider bias.

When to include control variables:
  • Pre-treatment variables (age, baseline education, family background): include; never exclude pre-treatment confounders.
  • Post-treatment outcomes (e.g., income when studying the effect of education): never include; these are bad controls.
  • Mediators (e.g., job search behavior when studying training effects): include only in an explicit mediation analysis.
  • Colliders (e.g., selection into the sample based on treatment status): never include; conditioning on them creates bias.
🔍 Control Variable Checklist
Variables are measured before treatment
Variables affect both treatment and outcome
Variables are not outcomes of the treatment
Variables are not mediators (unless doing mediation analysis)
Variables don't create selection bias (colliders)

🔍 Data Mining and P-Hacking

The Problem

Testing many specifications until finding significant results, then reporting only the "best" model without acknowledging the search process.

🚩 Warning Signs
  • Results are barely significant (p = 0.049)
  • Many alternative specifications were tested
  • Outcome definitions changed after seeing data
  • Subgroup analysis seems post-hoc
  • Sample restrictions appear arbitrary
Prevention Strategies
  • Pre-register analysis plans before seeing outcome data
  • Report all specifications tested, not just significant ones
  • Adjust for multiple testing when testing many hypotheses
  • Use hold-out samples for exploratory analysis
  • Focus on effect sizes and robustness, not just significance
* Stata: Bonferroni-adjusted joint tests of several coefficients (variable names illustrative)
reg outcome treat1 treat2 treat3 controls
test treat1 treat2 treat3, mtest(bonferroni)

# R: Multiple testing correction (p.adjust is part of base R's stats package; no library needed)
p_values <- c(0.02, 0.04, 0.01, 0.08)
p.adjust(p_values, method = "bonferroni")  # 0.08 0.16 0.04 0.32

🔵 Assumption Violations and Diagnostics

📈 Ignoring Assumption Violations

The Problem

Running regressions without checking whether key assumptions are satisfied, leading to invalid inference.

🔍 Essential Diagnostic Tests
Linearity: Plot residuals vs fitted values
Homoskedasticity: Breusch-Pagan test, White test
Normality: Histogram of residuals, Shapiro-Wilk test
Independence: Durbin-Watson test for serial correlation
Multicollinearity: VIF test, correlation matrix
Outliers: Cook's distance, leverage plots
* Stata: Comprehensive diagnostics
reg y x controls
predict resid, residuals
predict fitted, xb

* Heteroskedasticity tests
estat hettest        // Breusch-Pagan
estat imtest, white  // White test

* Multicollinearity
estat vif

* Outliers
predict leverage, leverage
predict cooksd, cooksd

# R: Comprehensive diagnostics
model <- lm(y ~ x + controls, data = data)
plot(model)  # built-in diagnostic plots

# Individual tests
library(car)
ncvTest(model)      # non-constant variance (heteroskedasticity)
vif(model)          # multicollinearity
outlierTest(model)  # outliers

🔗 Weak Instruments in IV Analysis

The Problem

Using instruments that are only weakly correlated with the endogenous variable, leading to biased and imprecise estimates.

First-stage F-statistic: interpretation and action needed
  • F > 104: strong instrument (less than 5% bias); proceed with IV.
  • 10 < F < 104: weak instrument (5-10% bias); report weak-IV robust tests.
  • F < 10: very weak instrument; find a better instrument or abandon the IV strategy.
Best Practices for IV
  • Always report first-stage F-statistics
  • Use Stock-Yogo critical values for weak IV tests
  • Consider LIML or Fuller-k estimators for weak IV
  • Test overidentifying restrictions if multiple instruments
  • Discuss exclusion restriction credibility extensively
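With a single instrument, the first-stage F is just the squared t-statistic on the instrument, so it can be checked by hand. A numpy-only Python sketch on simulated data (coefficients and variable names are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5_000
z = rng.normal(size=n)                      # instrument
ability = rng.normal(size=n)                # unobserved confounder
x = 0.3 * z + ability + rng.normal(size=n)  # endogenous regressor
y = 0.5 * x + ability + rng.normal(size=n)  # outcome (not needed for the first stage)

def first_stage_F(z, x):
    """First-stage F for one instrument: squared t-stat on z in a regression of x on z."""
    Z = np.column_stack([np.ones_like(z), z])
    beta, rss = np.linalg.lstsq(Z, x, rcond=None)[:2]
    sigma2 = rss[0] / (len(x) - 2)           # residual variance
    var_b = sigma2 * np.linalg.inv(Z.T @ Z)[1, 1]
    return float(beta[1] ** 2 / var_b)

print(f"First-stage F = {first_stage_F(z, x):.1f}")  # compare against the thresholds above
```

In practice you would read this statistic off your IV software's first-stage output; the point of the sketch is that it is an ordinary regression F, not something exotic.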

📊 Wrong Standard Errors

The Problem

Using inappropriate standard errors that don't account for data structure, leading to wrong statistical inference.

Choosing standard errors by data structure:
  • Heteroskedastic errors: robust (Huber-White). Stata: reg y x, robust. R: lm_robust(y ~ x, se_type = "HC1")
  • Clustered data: cluster-robust. Stata: reg y x, cluster(id). R: lm_robust(y ~ x, clusters = id)
  • Panel data: cluster-robust at the panel-unit level. Stata: xtreg y x, fe vce(cluster id). R: feols(y ~ x | id, cluster = ~id)
  • Survey data: survey-weighted. Stata: svy: reg y x. R: svyglm(y ~ x, design = survey_design)
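To see why the choice matters, here is a numpy-only Python sketch of the cluster-robust ("sandwich") variance with the CR1 small-sample correction, on simulated data where the regressor and part of the error vary only across clusters (all names and parameter values are invented for illustration):

```python
import numpy as np

def ols_cluster_se(X, y, groups):
    """OLS point estimates with CR1 cluster-robust standard errors."""
    n, k = X.shape
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    meat = np.zeros((k, k))
    for g in np.unique(groups):  # sum per-cluster score outer products
        s = X[groups == g].T @ resid[groups == g]
        meat += np.outer(s, s)
    G = len(np.unique(groups))
    c = (G / (G - 1)) * ((n - 1) / (n - k))  # CR1 finite-sample correction
    V = c * bread @ meat @ bread
    return beta, np.sqrt(np.diag(V))

# Simulated data: x and a shared error component vary only across 50 clusters
rng = np.random.default_rng(1)
G, m = 50, 100
groups = np.repeat(np.arange(G), m)
x = rng.normal(size=G)[groups]
y = 1 + 0.5 * x + rng.normal(size=G)[groups] + rng.normal(size=G * m)
X = np.column_stack([np.ones_like(x), x])

beta, cl_se = ols_cluster_se(X, y, groups)
resid = y - X @ beta
naive_se = np.sqrt(np.diag(np.linalg.inv(X.T @ X)) * (resid @ resid) / (G * m - 2))
print(f"slope SE: naive = {naive_se[1]:.3f}, cluster-robust = {cl_se[1]:.3f}")
```

The naive SE is several times too small here because observations within a cluster share the same x and error component; clustering restores honest inference, which is what the packaged commands above do for you.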

🟡 Method-Specific Mistakes

📈 Parallel Trends Violation in DiD

The Problem

Assuming parallel trends without testing, or proceeding with DiD when pre-trends are clearly different.

🔍 Pre-Trends Testing
Plot raw trends for treatment and control groups
Run event study regression with pre-treatment leads
Test joint significance of pre-treatment coefficients
Check if trends differ statistically
Consider alternative control groups if trends don't match
* Stata: Event study for pre-trends testing
gen rel_time = year - first_treat_year
* Factor notation does not allow negative values, so shift rel_time
* (here assuming the earliest lead is -5; adjust to your data)
gen rel_time_s = rel_time + 5
reg outcome ib4.rel_time_s##i.treated i.year i.district, vce(cluster district)  // base = rel_time of -1
testparm i(0/3).rel_time_s#1.treated  // joint test of the pre-treatment leads

# R: Event study with fixest
library(fixest)
data$rel_time <- data$year - data$first_treat_year
model <- feols(outcome ~ i(rel_time, treated, ref = -1) | year + district, data)
iplot(model)  # pre-treatment leads should be close to zero

📏 Bandwidth Gaming in RDD

The Problem

Choosing bandwidth to get desired results rather than using data-driven optimal selection.

🚩 Suspicious Practices
  • Reporting only one bandwidth without sensitivity analysis
  • Choosing bandwidth that barely makes results significant
  • Not using data-driven bandwidth selection methods
  • Asymmetric windows around the cutoff
RDD Best Practices
  • Use data-driven optimal bandwidth selection (e.g., Imbens-Kalyanaraman, or the Calonico-Cattaneo-Titiunik default in rdrobust)
  • Show results for multiple bandwidths
  • Test for manipulation around cutoff (McCrary test)
  • Check covariate balance at cutoff
  • Use local polynomial with bias correction
* Stata: Proper RDD analysis
rdrobust outcome running_var, c(cutoff)        // data-driven optimal bandwidth
rdrobust outcome running_var, c(cutoff) h(5)   // sensitivity checks at fixed bandwidths
rdrobust outcome running_var, c(cutoff) h(10)
rdrobust outcome running_var, c(cutoff) h(15)
rddensity running_var, c(cutoff)               // McCrary-style manipulation test

# R: Proper RDD analysis
library(rdrobust)
library(rddensity)
rdrobust(outcome, running_var, c = cutoff)
rdplot(outcome, running_var, c = cutoff)
rddensity(running_var, c = cutoff)

🔵 Presentation and Reporting

📋 Poor Table and Figure Quality

The Problem

Regression tables and figures that are hard to read, poorly labeled, or missing essential information.

✅ Table Best Practices
  • Include standard errors in parentheses below coefficients
  • Mark significance levels clearly (* p<0.1, ** p<0.05, *** p<0.01)
  • Report number of observations, R², and F-statistics
  • Use meaningful variable names and labels
  • Include fixed effects and control variable notes
  • Report confidence intervals for key estimates
* Stata: Professional table output
eststo clear
reg outcome treatment controls
eststo model1
reg outcome treatment controls i.year
eststo model2
esttab model1 model2 using "results.tex", ///
    se star(* 0.10 ** 0.05 *** 0.01) ///
    stats(N r2, labels("Observations" "R-squared")) ///
    title("Impact of Treatment on Outcome") ///
    replace

# R: Professional table output
library(stargazer)
model1 <- lm(outcome ~ treatment + controls, data = data)
model2 <- lm(outcome ~ treatment + controls + factor(year), data = data)
stargazer(model1, model2, type = "latex",
          star.cutoffs = c(0.10, 0.05, 0.01),  # match * p<0.1, ** p<0.05, *** p<0.01
          title = "Impact of Treatment on Outcome")

📝 Inadequate Robustness Discussion

The Problem

Not acknowledging limitations or discussing what could go wrong with the analysis.

✅ Robustness Section Checklist
Acknowledge key identification assumptions
Discuss potential sources of bias
Report alternative specifications
Test sensitivity to sample restrictions
Consider external validity limitations
Report results that don't support main findings
🎯 Final Advice: The Credibility Checklist

Before submitting any econometric analysis, ask yourself:

  • Have I stated my identification assumptions explicitly and used causal language only where they are credible?
  • Are my controls defensible: no post-treatment outcomes, mediators, or colliders?
  • Have I checked the assumptions behind my estimator and reported the relevant diagnostics?
  • Do my standard errors match the structure of the data?
  • Have I reported effect sizes and confidence intervals, not just significance stars?
  • Have I shown robustness to alternative specifications, bandwidths, and sample restrictions, and acknowledged results that cut against my main findings?

If you can't answer "yes" to all of these questions, keep working on your analysis.